Skip to content

Conversation

TomAugspurger
Copy link
Contributor

This moves the definition of the functions passed to _is_dtype_type from dynamically generated functions is is_integer_dtype to top-level functions. I can't run asv right now (#33315 (comment)) but here are some timeits

master

In [2]: t = np.dtype('int64')

In [3]: %timeit pd.api.types.is_integer_dtype(t)
2.22 µs ± 51.6 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

This PR

In [3]: %timeit pd.api.types.is_integer_dtype(t)
1.54 µs ± 39.6 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

@TomAugspurger TomAugspurger added Performance Memory or execution speed performance Dtype Conversions Unexpected or buggy dtype conversions labels Apr 7, 2020
@TomAugspurger TomAugspurger added this to the 1.1 milestone Apr 7, 2020
issubclass(tipo, klasses)
and not issubclass(tipo, (np.datetime64, np.timedelta64))
)
_object_classes = lambda tipo: issubclass(tipo, np.object_)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

better as a dict i would think

@jreback
Copy link
Contributor

jreback commented Apr 7, 2020

in fact could simply instantiate classes

@jbrockmendel
Copy link
Member

Nice, I didnt expect classes to account for that much overhead.

@WillAyd
Copy link
Member

WillAyd commented Apr 7, 2020

I can't run asv right now (#33315 (comment)) but here are some timeits

Hmm don't think that issue is the root of any ASV issues - are you getting any other error messages from the run? Maybe worth opening a dedicated issue

Copy link
Contributor

@jreback jreback left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would move the logic into the is_* functions themselves

@TomAugspurger
Copy link
Contributor Author

Which logic would you move? It seems like _is_dtype_or_type requires a callable, and I want to keep the definition of that callable out of e.g. is_bool_dtype.

False
"""
return _is_dtype_type(arr_or_dtype, classes(np.timedelta64))
return _is_dtype_type(arr_or_dtype, _timedelta64_classes)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

example instead of creating the module level variable I would simply hardcode:
lambda tipo: issubclass(tipo, np.timedelta64) right here. its much more obvious.

@jbrockmendel
Copy link
Member

@TomAugspurger is this still something you want to merge, or is it superceded by #33400?

@TomAugspurger
Copy link
Contributor Author

TomAugspurger commented Apr 16, 2020 via email

@jbrockmendel
Copy link
Member

needs rebase

@jreback
Copy link
Contributor

jreback commented May 25, 2020

ok with this if we can update to comments.

@jreback
Copy link
Contributor

jreback commented Jun 14, 2020

conceptually ok with this if you can update to comments

@jreback jreback removed this from the 1.1 milestone Jul 9, 2020
@jreback
Copy link
Contributor

jreback commented Jul 17, 2020

closing as stale

@jreback jreback added this to the No action milestone Jul 17, 2020
@jreback jreback closed this Jul 17, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Dtype Conversions Unexpected or buggy dtype conversions Performance Memory or execution speed performance

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants